MEASURING CONTROL STRUCTURE COMPLEXITY
THROUGH EXECUTION SEQUENCE GRAMMARS*


B. J. MacLennan
Naval Postgraduate School
Monterey, CA 93940


1. Abstract

A  method  for  measuring  the  complexity  of  control  structures 
is  presented.  It  is  based  on  the  size  of  a  grammar  describing  the 
possible  execution  sequences  of  the  control  structure.  This 
method  is  applied  to  a  number  of  control  structures,  including 
Pascal's control structures, Dijkstra's operators, and a structure
recently proposed by Parnas. The verification of complexity
measures  is  briefly  discussed. 


2.  Introduction 

Many questions face a language designer. Is a "while-do"
better than a "repeat-until"? Is a "do-od" more complex than
these? How does an "if-elsif-else" structure compare with nested
"if-then-else"s? To this end it is useful to have a complexity
measure for control structures that can serve as a figure of
merit in making these determinations.


*  The  work  reported  herein  was  supported  by  the  Foundation 
Research  Program  of  the  Naval  Postgraduate  School  with  funds 
provided  by  the  Chief  of  Naval  Research. 

In this paper we take the view that the complexity of a control
structure is related to the complexity of the corresponding
language of execution sequences. The complexity of this language
can then be measured by determining the structural complexity of
the corresponding grammar using techniques described in [4]. The
motivation for this technique is the assumption that to understand
a control structure a programmer must internalize the possible
control sequences defined by that control structure. A
further assumption is that the difficulty of doing this is
approximated by the size of a grammar describing this class of
execution sequences.

In the next section we informally present this technique by
measuring the size of a conventional extended-BNF grammar for the
language of execution sequences. The measurements depend on
details of the concrete BNF notation that do not seem to be
relevant to the control structure's complexity. Therefore, in
the following section these measurement techniques are refined by
measuring an abstract^1 grammar for the language; this eliminates
irrelevant details of the concrete syntactic notation. Finally,
we tabulate the complexity of a number of common control structures
and discuss some limitations of the method.


1. By this we mean a grammar expressed in an abstract rather
than a concrete form, not a grammar for an abstract language
as opposed to a concrete language.

3. Concrete Grammar Size

3.1 Conditionals

We will begin our analysis with a simple control structure,
the Pascal if-then statement. Consider an if-then such as this:

    if B then S

and consider the possible execution sequences. These execution
sequences can be written as regular expressions, which use the
operations catenation, union, Kleene cross, and Kleene star.
Note that B will always be executed, but S will be executed only
if B was true. Therefore the possible execution sequences are BS
and B, depending on whether B was true or not (we represent
consecutive execution by catenation). Hence, the set of possible
execution sequences is

    E{if B then S} = BS + B

where '+' represents the union of sets of execution sequences.
If we then define the complexity C{X} of a construct X to be the
size of the execution sequence grammar of X,

    C{X} = |E{X}|

then we can compute the complexity of the if-then. To measure
the complexity of an execution sequence grammar we will take a
very naive, concrete view, and count the tokens in the grammar.
Thus,

    C{if B then S} = |BS + B| = 4

since the tokens are 'B', 'S', '+', and 'B'.

Next we will consider the full if-then-else:

    if B then S else T

In this case it can be seen that B is always executed, followed
by either S or T. Thus the possible execution sequences are BS
and BT, which we can factor and write B(S + T). The assumption
here is that the complexity is related to the shortest grammar
for the language of execution sequences. Thus the complexity of
the if-then-else is:

    C{if B then S else T} = |B(S+T)| = 6

It is a little more complex than the simple if-then, as is
expected.

Finally, we will analyze the case-statement:

    case E of (S_1, S_2, ..., S_n)

Clearly, E must be executed first, and then one of the S_i. The
complexity is easy to calculate:

    C{case E of (S_1, ..., S_n)} = |E(S_1 + ... + S_n)| = 2n+2

where n is the number of cases.
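
The token counts for the three conditionals can be reproduced mechanically. The following is a minimal sketch in Python, not part of the original method's presentation: the names TOKEN, concrete_size, and case_expr are hypothetical, and it assumes that every operand is a letter with an optional numeric subscript and that each parenthesis, bracket, '+', '*', and '...' counts as one token.

```python
import re

# One token per operand (a letter with an optional numeric subscript, e.g. B or S12)
# or per punctuation symbol: parentheses, brackets, '+', '*', and the '...' operator.
TOKEN = re.compile(r"[A-Za-z]\d*|\.\.\.|[()\[\]+*]")


def concrete_size(expr: str) -> int:
    """Count the tokens in a regular expression over execution sequences."""
    stripped = expr.replace(" ", "")
    tokens = TOKEN.findall(stripped)
    # Refuse input containing characters the tokenizer does not know about.
    if "".join(tokens) != stripped:
        raise ValueError(f"unrecognized symbol in {expr!r}")
    return len(tokens)


def case_expr(n: int) -> str:
    """Build E(S1+...+Sn), the execution sequences of an n-way case."""
    return "E(" + "+".join(f"S{i}" for i in range(1, n + 1)) + ")"


print(concrete_size("BS + B"))      # if-then: 4
print(concrete_size("B(S+T)"))      # if-then-else: 6
print(concrete_size(case_expr(4)))  # case with n = 4: 10 = 2n+2
```

The same function can be applied to any of the concrete expressions in this section, which makes it easy to check a count after factoring an expression differently.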

3.2 Iterative Constructs

Next we will analyze the Pascal repeat-until statement.
Consider a repeat-until such as this:

    repeat S until B

The effect of this is to execute S until B evaluates to true.
Therefore, we execute S and then B. If B is false, we again execute
S and B. This process continues until B becomes true (which
must eventually happen in a terminating program). Therefore the
possible execution sequences can be written

    SB + SBSB + SBSBSB + ...

where concatenation denotes sequential execution and '+' can be
read as "or". Strictly, the '+' denotes set union, since the above
expression defines the set of possible execution sequences.
Using exponential notation, the execution sequences for the
repeat-until can be abbreviated:

    SB + (SB)^2 + (SB)^3 + ...

Using the Kleene cross notation, this infinite union can be written

    (SB)^+

and can be read "one or more repetitions of the sequence SB". This
agrees with the way we think of the behavior of a repeat-until.
The complexity of this construct is measured simply:

    C{repeat S until B} = |(SB)^+| = 5

Very much the same analysis can be applied to Pascal's
while-do:

    while B do S

This construct executes B; if the result is false it terminates,
otherwise it executes S and loops back to test B again. Therefore
we can write the execution sequences

    B + BSB + BSBSB + BSBSBSB + ...

That is,

    B + B(SB)^1 + B(SB)^2 + B(SB)^3 + ...

Now, if we use ε to represent the null execution sequence, then B
can be factored out of the above expression:

    B[ε + (SB)^1 + (SB)^2 + (SB)^3 + ...]

This can be simplified with Kleene's cross:

    B[ε + (SB)^+]

It now becomes apparent that this can be simplified even further
by using the Kleene star notation, since

    C* = ε + C^+

Thus, we can compute the complexity of the while-do:

    C{while B do S} = |B(SB)*| = 6

Notice that the while-do is slightly more complex than the
repeat-until because of the leading initial test of the condition.
It would probably be more intuitive to ignore the final failing
test of B and analyze the while-do as:

    C{while B do S} = |(BS)*| = 5

which agrees with our intuitive notion that a while-do and
repeat-until have about the same complexity. It is not known at
this time which measurement technique is correct.

Since we have considered leading-decision loops and
trailing-decision loops, we will next analyze mid-decision loops.
A mid-decision loop has the form:

    loop S exit when B; T end loop

The meaning of this is: execute S, then test B; if B is true
terminate the iteration, otherwise execute T and continue looping.
Mid-decision loops are often useful in search operations.
It is easy to see that the execution sequences are:

    SB + SBTSB + SBTSBTSB + ...

or in general,

    SB(TSB)*

The resulting complexity is

    C{loop S exit when B; T end loop} = |SB(TSB)*| = 8

As would be expected, this is more complex than either the leading
or trailing decision loops.
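
The three regular expressions derived above can be checked against simulated runs of each loop. The sketch below is an illustration under stated assumptions rather than part of the original analysis: the function names are hypothetical, each test of B is given a predetermined outcome, and a run that never terminates within the supplied outcomes is discarded.

```python
import re
from itertools import product


def run_repeat_until(outcomes):
    """Trace 'repeat S until B'; outcomes[i] is the value of the i-th test of B."""
    trace = ""
    for b in outcomes:
        trace += "SB"          # body, then the trailing test
        if b:
            return trace
    return None                # B never became true: not a terminating run


def run_while(outcomes):
    """Trace 'while B do S', including the final failing test of B."""
    trace = ""
    for b in outcomes:
        trace += "B"
        if not b:
            return trace
        trace += "S"
    return None


def run_mid_exit(outcomes):
    """Trace 'loop S exit when B; T end loop'."""
    trace = ""
    for b in outcomes:
        trace += "SB"
        if b:
            return trace
        trace += "T"
    return None


CASES = [
    (run_repeat_until, r"(SB)+"),   # Kleene cross
    (run_while, r"B(SB)*"),
    (run_mid_exit, r"SB(TSB)*"),
]

# Every terminating run must be a member of the corresponding language.
for run, pattern in CASES:
    for length in range(1, 6):
        for outcomes in product([False, True], repeat=length):
            trace = run(outcomes)
            if trace is not None:
                assert re.fullmatch(pattern, trace), (run.__name__, trace)
```

The check only confirms that simulated traces fall inside each language; it does not by itself show that the expressions are the shortest possible descriptions.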

So far we have described execution sequences using just the
operators used in regular expressions (viz., Kleene cross, Kleene
star, union, and catenation). Recently, however, several
extended BNF notations (see, for example, [2, 3, 5, 8]) have
adopted an operator that expresses a very common configuration,
the delimited sequence. An example is "a sequence of names
separated by commas." Using the regular expression operators,
this would have to be written

    <name> (, <name>)*

which requires repeating <name>. The delimited sequence notation
allows this to be expressed directly:

    <name> , ...

In general, 'C D ...' means the class of all non-empty sequences of
Cs alternating with Ds; that is, C(DC)*. Using this notation,
which expresses a very simple structural idea, the leading-decision
and mid-decision loops have the complexity:

    C{while B do S} = |BS...| = 3

    C{loop S exit when B; T end loop} = |(SB)T...| = 6

This may seem like an ad hoc definition of an operator to simplify
the description of these execution sequences. For this
reason we have restricted our attention to notations that have
already proved useful in describing sets of sequences. As we
have said, the delimited sequence operator has been independently
proposed by several authors as embodying a useful configuration.
Whether it should be used in measuring control structure complexity
remains an open question.
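
The equivalence between 'C D ...' and C(DC)* can be made concrete with ordinary regular expressions. In this minimal sketch the helper name delimited is hypothetical, and a name is assumed, purely for illustration, to be a run of lowercase letters.

```python
import re


def delimited(item: str, sep: str) -> str:
    """Regular expression for 'item sep ...', that is, item (sep item)*."""
    return f"(?:{item})(?:(?:{sep})(?:{item}))*"


# "A sequence of names separated by commas."
names = re.compile(delimited(r"[a-z]+", r","))

# The two loops written with the delimited sequence operator.
while_do = re.compile(delimited("B", "S"))    # B S ...    is B(SB)*
mid_exit = re.compile(delimited("SB", "T"))   # (SB) T ... is SB(TSB)*

print(bool(names.fullmatch("alpha,beta,gamma")))  # True
print(bool(names.fullmatch("alpha,")))            # False: trailing delimiter
print(bool(while_do.fullmatch("BSBSB")))          # True
print(bool(mid_exit.fullmatch("SBTSBTSB")))       # True
```

Note that the item appears once in the call but twice in the expanded expression, which is exactly the repetition the delimited sequence notation avoids.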

3.3 Dijkstra's Constructs

In this section we analyze Dijkstra's if-fi and do-od control
structures [1]. The if-fi has this form:

    if B_1 -> S_1 [] ... [] B_n -> S_n fi

The guards B_1, ..., B_n are evaluated non-deterministically. If
one or more evaluates to true, then one of the corresponding
statements is chosen and executed. If none of the guards is
true, then an error condition exists and the program aborts.
Thus, the possible execution sequences are:

    B_1 S_1 + B_2 S_2 + ... + B_n S_n

The size of this expression is 3n-1, so the complexity of the
if-fi is:

    C{if ... B_i -> S_i ... fi} = 3n-1

The do-od is an iterative construct patterned on the if-fi.
It has the form:

    do B_1 -> S_1 [] ... [] B_n -> S_n od

On each iteration the guards are evaluated non-deterministically.
If none of them are true, the loop terminates. Otherwise one of
the corresponding S_i is selected, and the loop repeats. It is
easy to see that the execution sequences are

    (B_1 S_1 + ... + B_n S_n)*

Therefore the complexity of the do-od is:

    C{do ... B_i -> S_i ... od} = 3n+2

which is approximately the same as the if-fi.

It is instructive to compare the complexity of the if-fi
(3n-1) with that of the more conventional if-elsif-else (or
multi-branch conditional). Effectively, we are comparing the
complexities of non-deterministic and deterministic conditionals.
The if-elsif-else has the form:

    if B_1 then S_1 elsif B_2 then S_2 ... else E endif

This is executed strictly sequentially; if B_1 is false, then B_2
is tried; if B_2 is false, then B_3 is tried, and so forth. This
is equivalent to nested if-then-else statements. It is easy to
write down the execution sequences:

    B_1 S_1 + B_1 B_2 S_2 + ... + B_1 B_2 ... B_n E

The length of this regular expression is

    2 + 3 + ... + (n+1) = Σ (i+1) for i = 1, ..., n = (n^2 + 3n)/2

This is not the complexity, however, since this regular
expression can be simplified by factoring to

    B_1(S_1 + B_2(S_2 + ... B_n(S_n + E) ... ))

We can find the length of this expression inductively. Let

    E_i = B_i(S_i + E_{i+1})

for i <= n, and E_{n+1} = E. Then,

    |E_i| = 5 + |E_{i+1}|

and |E_{n+1}| = 1. Therefore |E_1| = 5n+1. In summary, the complexity
of the n-branch if-elsif-else is

    C{if ... elsif B_i then ... else E endif} = 5n+1

Thus, the complexity of the non-deterministic if-fi, 3n-1, is
considerably less than that of the deterministic if-elsif-else,
5n+1.
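
The three closed forms in this section (3n-1, 3n+2, and 5n+1) can be confirmed by building the expressions for a given n and counting tokens. This is a sketch under the same counting convention as before (one token per subscripted operand and per '(', ')', '+', '*'); the function names are hypothetical.

```python
import re

# Operands are a letter plus optional digits (B1, S12, E); punctuation counts singly.
TOKEN = re.compile(r"[A-Za-z]\d*|[()+*]")


def size(expr: str) -> int:
    """Concrete size: number of operand and punctuation tokens."""
    return len(TOKEN.findall(expr))


def if_fi(n: int) -> str:
    """B1S1 + B2S2 + ... + BnSn"""
    return "+".join(f"B{i}S{i}" for i in range(1, n + 1))


def do_od(n: int) -> str:
    """(B1S1 + ... + BnSn)*"""
    return "(" + if_fi(n) + ")*"


def if_elsif_else(n: int) -> str:
    """Factored form, built from the inside out: E_i = B_i(S_i + E_{i+1})."""
    expr = "E"                          # E_{n+1} = E
    for i in range(n, 0, -1):
        expr = f"B{i}(S{i}+{expr})"
    return expr


for n in range(1, 6):
    print(n, size(if_fi(n)), size(do_od(n)), size(if_elsif_else(n)))
```

For n = 2, for example, if_elsif_else(2) returns B1(S1+B2(S2+E)), which has 11 tokens, in agreement with 5n+1.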

4. Abstract Grammar Size

4.1 Introduction

The reader will probably have noticed that our complexity
measurements include aspects of the regular expression notation,
such as parentheses, that on an intuitive basis are not very
relevant. Previous work [4, 6] has shown that better measurements
are obtained if an abstract form of the grammar is measured,
rather than some concrete representation, such as we have
used in the preceding section. This approach will count operators
that alter the sets of execution sequences, such as union and
catenation, while ignoring those that do nothing, such as
parentheses. Previous work has also shown that it is best to
count multi-armed alternations as a single operator, rather than
several. That is, an expression such as

    S_1 + S_2 + ... + S_n

(which would normally be counted as 2n-1) will be analyzed as
though it were written

    Σ[S_1, S_2, ..., S_n]

which gives it a count of n+1 (n for the S_i and 1 for the Σ).
Since we are counting the operators that "do something" we will
now also have to count catenation, so we will write ST explicitly
as S·T.

4.2 Recomputation of Complexities

In this section we will recompute the complexities of the
constructs analyzed using concrete grammars. The Pascal control
structures are trivial:

    C{if B then S}                     = |B·S + B|            = 5
    C{if B then S else T}              = |B·(S+T)|            = 5
    C{while B do S}                    = |(B·S)*|             = 4
    C{repeat S until B}                = |(S·B)^+|            = 4
    C{loop S exit when B; T end loop}  = |(S·B) T ...|        = 5
    C{case E (... S_i ...)}            = |E·Σ[... S_i ...]|   = n+3

The results of these measurements are not very different from
those based on the concrete grammar.

Next we will consider Dijkstra's constructs, which will make
use of the Σ operator. The if-fi is analyzed

    C{if ... B_i -> S_i ... fi} = |Σ[..., B_i·S_i, ...]| = 3n+1

The result is almost the same as with the concrete grammar; the
addition of the catenation operations has compensated for the
omitted unions (+).

The do-od construct is exactly analogous:

    C{do ... B_i -> S_i ... od} = |Σ[..., B_i·S_i, ...]*| = 3n+2

The execution sequences of the if-elsif-else are:

    B_1·(S_1 + B_2·(S_2 + ... B_n·(S_n + E) ... ))

In this case the inductive equation is

    E_i = B_i·(S_i + E_{i+1})

Therefore each clause adds 4, resulting in a total complexity:

    C{if-elsif-else} = 4n+1

This is a significantly lower measurement than that obtained with
the concrete grammar (5n+1), largely owing to the abstract
grammar's insensitivity to parentheses.

5. The Parnas It-Ti Construct

5.1 The Non-Deterministic It-Ti

In this section we analyze the complexity of a new control
structure proposed by Parnas [7]. This control structure is a
combination of Dijkstra's if-fi and do-od structures and has the
form:

    it B_1 -> S_1 X_1 V ... V B_n -> S_n X_n ti

The X_i is either an up-arrow indicating continuation of the
iteration or a down-arrow indicating termination of the iteration.
The semantics of the it-ti is as follows: The guards are
evaluated non-deterministically. Out of the ones that evaluate
to true, one is chosen and its corresponding statement S_i is
executed. When this statement has completed the continuation X_i is
considered. If it is repeat (an up-arrow) then the it-ti loops
again; if it is break (a down-arrow) then the it-ti terminates.

Since the it-ti described above is non-deterministic, the
order of its arms can be changed without altering its meaning.
This simplifies the analysis of the it-ti because the repeating
arms and the breaking arms can be grouped together. We will
assume that there are m repeating arms, and that they are moved
to the front of the it-ti. The complexity is then easy to calculate:

    C{it B_1 -> S_1 repeat V ... V B_m -> S_m repeat
         V B_{m+1} -> S_{m+1} break V ... V B_n -> S_n break ti}

        = |Σ[B_1·S_1, ..., B_m·S_m]* Σ[B_{m+1}·S_{m+1}, ..., B_n·S_n]|
        = 3m+2 + 3(n-m)+1
        = 3n+3

Thus the complexity is comparable to that of the if-fi and do-od.

5.2 The Deterministic It-Ti

In this section we analyze a variant of the it-ti defined by
Parnas called the deterministic it-ti. This has the form

    it B_1 -> S_1 X_1 else or ... else or B_n -> S_n X_n ti

In this construct the guards are executed strictly sequentially.
In other words, if B_1 is true, then S_1 is executed and continuation
action X_1 is taken; otherwise testing continues with B_2. As
for the non-deterministic it-ti, an error condition exists if
none of the guards is true.

The analysis of the deterministic it-ti is considerably more
complicated than the non-deterministic since the arms cannot be
rearranged to group the repeating and breaking arms together. In
fact, each different arrangement of breaks and repeats (i.e., of
the X_i) effectively defines a different control structure. To
keep the mathematics tractable we introduce several abbreviations.
The notation

    E<x_1 x_2 ... x_n>

represents the execution sequences of a deterministic it-ti whose
i-th continuation action is x_i. We will use 'b' to represent
'break', 'r' to represent 'repeat', 'x' to represent either 'b'
or 'r', 'X' to represent a sequence of either 'b's or 'r's, and
'B' to represent a sequence of 'b's. These notations will just
be used inside the angle brackets of E<...>.

We will also make one change to the semantics of the deterministic
it-ti to simplify the analysis. If none of the guards
are true, we will assume that the it-ti "falls through" like a
do-od. Later we will correct the formula to account for the fact
that this is an error condition in Parnas' formulation.

The formula will be derived by an inductive process starting
with the degenerate it-ti that contains no arms, viz., it ti.
This is a fall-through, and the corresponding execution sequence
is the null sequence, so

    E<> = ε

We will next investigate extensions of an it-ti formed by adding
a new arm to the beginning. The formulas for the execution
sequences are derived by a variant of the method of undetermined
coefficients suggested by R.W. Hamming. In this method, the general
form of a formula is assumed and its specific coefficients
or parts are derived. Deterministic it-tis are of two sorts:
those that contain only 'break's (and are hence multi-branch
conditionals), and those which contain at least one 'repeat'. The
latter we will assume have an execution sequence of the form
L*·T + U, for some regular expressions L, T, and U. First, however,
we will address it-tis with only 'break's.

Consider an it-ti of the form bB, i.e., all of whose arms
are 'break's. We want to calculate the execution sequences
E<bB>. Suppose the arm corresponding to the b is C -> S; then the
possible execution sequences of bB are C·S or C·E<B>, where E<B>
is the set of execution sequences of the reduced it-ti B. This
can be factored giving,

    E<bB> = C·(S + E<B>)

Next consider an it-ti of the form rB, i.e., a repeating arm
followed by all breaking arms. Suppose the repeating arm is
C -> S. Then, if C is true, S will be executed and the it-ti will
repeat. Otherwise the it-ti B is executed (which is just a
multi-branch conditional). Therefore the possible execution
sequences are

    E<rB> = (C·S)*·C·E<B>
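
The formula for E<rB> can be spot-checked by simulating a small deterministic it-ti and matching its traces against the corresponding regular expression. The sketch below is illustrative only: it assumes the fall-through semantics adopted above, uses a hypothetical three-arm example with one repeating arm (guard c, statement s) followed by two breaking arms (guards p and q, statements x and y), and feeds each guard evaluation a predetermined outcome.

```python
import re
from itertools import product

# (guard, statement, kind) with kind 'r' for repeat and 'b' for break.
ARMS = [("c", "s", "r"), ("p", "x", "b"), ("q", "y", "b")]


def run_itti(arms, outcomes):
    """Trace a deterministic it-ti; one outcome is consumed per guard evaluation."""
    values = iter(outcomes)
    trace = ""
    while True:
        for guard, stmt, kind in arms:
            try:
                value = next(values)
            except StopIteration:
                return None            # ran out of outcomes: discard this run
            trace += guard
            if value:
                trace += stmt
                if kind == "b":
                    return trace       # break: the it-ti terminates
                break                  # repeat: start testing from the first arm
        else:
            return trace               # no guard was true: fall through


def all_break_regex(arms):
    """E<B> for breaking arms only: C·(S + E<rest>), with E<> = empty."""
    expr = ""
    for guard, stmt in reversed(arms):
        expr = f"{guard}(?:{stmt}|{expr})"
    return expr


# E<rB> = (C·S)*·C·E<B>
PATTERN = re.compile("(?:cs)*c" + all_break_regex([("p", "x"), ("q", "y")]))

for length in range(1, 9):
    for outcomes in product([False, True], repeat=length):
        trace = run_itti(ARMS, outcomes)
        if trace is not None:
            assert PATTERN.fullmatch(trace), trace
```

As with any simulation, this only tests the formula on one arrangement of arms and on bounded runs; it is a sanity check rather than a proof.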

Next we will consider extensions of an it-ti containing at
least one repeat, X. Thus we will derive E<xX> from E<X>. By
the method of undetermined coefficients, we will assume E<X> to
have the form L*·T + U (since by assumption it contains a loop).

Consider first the case of adding a breaking arm C -> S; we
wish to calculate E<bX>. The effect of this it-ti is to evaluate
C; if it's true then evaluate S and break; otherwise continue
with the execution of X. Therefore the execution sequences are

    E<bX> = (C·L)*·T + C·(S+U)

This can be seen to have the form L*·T+U.

Next we will consider the case of adding a repeating arm
C -> S; we wish to calculate E<rX>. The effect of this it-ti is to
evaluate C; if it's true then evaluate S and repeat; otherwise
continue with the execution of X. Therefore the execution
sequences are

    E<rX> = [C·(S+L)]*·T + C·U

Again, this has the form L*·T+U. Our use of the method of
undetermined coefficients has been successful.

It is now a routine matter to calculate the complexity of
these regular expressions.

    |E<>|   = 0
    |E<bB>| = |E<B>| + 4
    |E<rB>| = |E<B>| + 7
    |E<bX>| = |L| + |T| + |U| + 9 = |E<X>| + 6
    |E<rX>| = |L| + |T| + |U| + 9 = |E<X>| + 6

The last two equations follow from the fact that

    |E<X>| = |L| + |T| + |U| + 3

since E<X> = L*·T + U.

These equations can now be solved for the complexity. Consider
first the case of an it-ti all of whose arms are breaks, for
which C{B} = |E<B>|. You can see from the equations above that each
arm adds 4 to the complexity. Therefore, if there are m breaks
b_1 b_2 ... b_m, the complexity is

    C{b_1 ... b_m} = 4m

Next consider an it-ti with one repeat followed by m breaks.
This has the form r b_1 ... b_m. The complexity is

    C{r b_1 ... b_m} = C{b_1 ... b_m} + 7 = 4m+7

Finally, we have the case of adding either a break or a
repeat to an it-ti that already contains a mixture of breaks and
repeats. Regardless of whether the new arm is a break or repeat,
it adds 6 to the complexity. Therefore, if k arms are added the
complexity is increased by 6k:

    C{x_1 ... x_k r b_1 ... b_m} = 6k + 4m + 7

It is already apparent that the deterministic it-ti is a complex
control structure since there is a factor of 6 involved.

To be able to compare the it-ti with other control structures
it is useful to have its complexity in terms of n, the
number of arms. Note that n = k+m+1 if there is at least one
repeat, otherwise n = m. Therefore if there are no repeats we
have

    C{B} = 4n

If there is at least one repeat we have

    C{X} = 6k+4m+7 = 4(k+m+1) + 2k + 3 = 4n + 2k + 3

This is still not a very convenient form, since k is one less
than the number of arms that aren't terminal breaks, a rather
unintuitive quantity. It is more convenient to express the
complexity either in terms of m, the number of terminal breaks, or
in terms of s = n-m, the number of arms that aren't terminal
breaks:

    C{X} = 6n-2m+1 = 4n+2s+1

Since m can vary from n to 0 it's easy to see that the complexity
of the deterministic it-ti can vary from 4n to 6n+1. All of
these are considerably more complex than the non-deterministic
it-ti's 3n+3.

To account for the fact that Parnas' it-ti aborts if none of
the guards are satisfied, it is merely necessary to add an
additional breaking arm to the end of the form 'true -> abort break'.
This increases the complexity to 6n-2m+5 (or 4n+2s+5).
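
The closed forms can be checked directly against the size equations |E<>| = 0, |E<bB>| = |E<B>| + 4, |E<rB>| = |E<B>| + 7, and |E<xX>| = |E<X>| + 6. In this sketch an it-ti is written as a string of 'b' and 'r' characters, first arm first; the function names are hypothetical.

```python
from itertools import product


def det_itti_size(arms: str) -> int:
    """Size of E<arms> under fall-through semantics, from the size equations."""
    if set(arms) - {"b", "r"}:
        raise ValueError("arms must consist of 'b' and 'r' only")
    size = 0                      # |E<>| = 0
    seen_repeat = False
    for arm in reversed(arms):    # arms are added at the front, so build from the end
        if seen_repeat:
            size += 6             # |E<bX>| = |E<rX>| = |E<X>| + 6
        elif arm == "b":
            size += 4             # |E<bB>| = |E<B>| + 4
        else:
            size += 7             # |E<rB>| = |E<B>| + 7
            seen_repeat = True
    return size


def closed_form(arms: str) -> int:
    """4n when every arm is a break, otherwise 6n - 2m + 1."""
    n = len(arms)
    m = n - len(arms.rstrip("b"))     # number of terminal breaks
    return 4 * n if m == n else 6 * n - 2 * m + 1


for n in range(0, 9):
    for combo in product("br", repeat=n):
        arms = "".join(combo)
        assert det_itti_size(arms) == closed_form(arms), arms
        if "r" in arms:
            # Appending the 'true -> abort break' arm gives 6n - 2m + 5.
            m = n - len(arms.rstrip("b"))
            assert det_itti_size(arms + "b") == 6 * n - 2 * m + 5, arms

print(det_itti_size("rbb"))   # 4m+7 with m = 2: 15
print(det_itti_size("brb"))   # 6k+4m+7 with k = 1, m = 1: 17
```

The exhaustive loop covers every arrangement of up to eight arms, so the agreement between the recurrence and the closed forms does not depend on a particular choice of example.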

Notice  that  the  deterministic  it-ti  is  the  first  construct 
we  have  encountered  whose  complexity  depends  on  another  parameter 
besides  n,  the  number  of  arms.  This  reflects  the  fact  that  the 
deterministic  it-ti  is  in  fact  a  family  of  control  structures, 
since  each  different  arrangement  of  repeats  and  breaks  defines  a 
different  pattern  of  control  flow. 

6. Conclusions

The complexities calculated for the various control structures
are summarized in the following table. This table also
shows the complexity per arm to facilitate comparisons between
structures with a fixed number of arms (e.g., the if-then-else)
and those with a variable number of arms.

    control structure          complexity    per arm

    if-then                    5             5
    if-then-else               5             2.5
    while-do                   4             4
    repeat-until               4             4
    mid-decision loop          5             2.5
    case                       n+3           1+
    multi-branch if            4n+1          4+
    if-fi                      3n-1          3-
    do-od                      3n+2          3+
    non-deterministic it-ti    3n+3          3+
    deterministic it-ti        6n-2m+5       4-6

Figure 1. Control Structure Complexities

The complexities are also shown graphically in the following figure.
These measures seem to agree with our intuitive estimations
of the relative complexity of these control structures.

Whenever a measure such as this is proposed the question of
its validation must be asked. In other words, is this the
correct complexity measure? Complexity is used in many senses.
Perhaps the most common uses relate to the difficulty of
understanding. That is, one thing is more complex than another if it
is more difficult to understand. The implication seems to be
that complexity is a psychological property that requires
psychological techniques in its verification. However, this is not the
case.

Figure 2. Complexities of Various Control Structures


An analogy may help to clarify the issues. When we hold an
object in our hands we experience a psychological property, a
sensation of weight. This property depends on many circumstances,
including the shape of the object, how long it's
held, and so forth. Similarly, our sensation of time can be
quite subjective and can depend on many circumstances.
Psychological weight and psychological duration are valid objects of
scientific inquiry and in fact have been studied by psychologists.
These properties are analogous to psychological complexity,
the perceived complexity of a system.

Although our first notions of time were based on psychological
duration and our first notions of weight on psychological
weight, these are not the only notions of time and weight that we
now use. Physicists have discovered notions of time and weight
that are objective, i.e., that are independent of individual
psychologies. Time is measured by clocks even though we realize
that there is often only a loose correlation between clock time
and psychological time. Similarly, the concepts weight and mass
are defined and measured in completely non-psychological terms.
The measurement of physical duration and physical weight is a
problem of physics; the measurement of psychological duration or
weight is a problem of psychology, as is the establishment of the
relation between the physical and psychological properties.

Physicists have studied the physical properties rather than
the psychological properties because they have found the physical
properties to be more easily reproduced in experiments. That
they can be measured objectively is certainly significant, since
it eliminates a dependence on a very imperfectly understood
entity, human psychology. Even more importantly, however, the
physical properties have been found to be part of a highly
integrated system of laws and principles that have been very
productive in understanding the world. In other words, these
physical properties have great practical value.

How does this apply to complexity measures? We can of
course try to understand the phenomenon of psychological complexity;
this is a fruitful area of research for psychologists. Our
analogy suggests, however, that there is another useful notion of
complexity, that there may be a non-psychological measure of
complexity. This paper has presented one such measure. Whether it
turns out to be the "right" measure or not will depend largely on
whether it can be integrated into a comprehensive, practical
theory. Such an integration should also resolve some of the
measurement ambiguities, such as how the delimited sequence
operator should be counted in measurements. In the meantime it
must remain as one possible notion of complexity. We should not
be surprised at this state of affairs; it took many years for
physicists to settle on definitions of work, force, mass, etc.